Relative Facial Action Unit Detection
This paper presents a subject-independent facial action unit (AU) detection
method by introducing the concept of relative AU detection, for scenarios where
the neutral face is not provided. We propose a new classification objective
function which analyzes the temporal neighborhood of the current frame to
decide if the expression recently increased, decreased or showed no change.
This approach departs significantly from the conventional absolute method,
which classifies AUs from the current frame alone, without explicit
comparison with neighboring frames. Our proposed method improves
robustness to individual differences such as face scale and shape, age-related
wrinkles, and transitions among expressions (e.g., lower intensity of
expressions). Our experiments on three publicly available datasets (Extended
Cohn-Kanade (CK+), Bosphorus, and DISFA databases) show significant improvement
of our approach over conventional absolute techniques.
Keywords: facial action coding system (FACS); relative facial action unit detection; temporal information
Comment: Accepted at IEEE Winter Conference on Applications of Computer Vision, Steamboat Springs, Colorado, USA, 201
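The relative-detection idea above can be sketched as follows: instead of thresholding a single frame's AU score (absolute detection), compare the current frame's score against its recent temporal neighborhood and label the change. This is a minimal illustrative sketch, not the paper's objective function; the function name, window size, and threshold are hypothetical.

```python
# Sketch of relative AU detection: compare the current frame's AU intensity
# estimate to the mean of a short temporal neighborhood, instead of
# classifying the current frame in isolation. Names and thresholds are
# illustrative assumptions, not taken from the paper.

def relative_au_label(scores, t, window=3, delta=0.1):
    """Classify the AU-intensity change at frame t.

    scores : per-frame AU intensity estimates (from any regressor)
    t      : index of the current frame
    window : number of preceding frames forming the temporal neighborhood
    delta  : minimum change considered meaningful (hypothetical threshold)
    """
    start = max(0, t - window)
    neighborhood = scores[start:t]
    if not neighborhood:
        return "unchanged"  # no history to compare against
    baseline = sum(neighborhood) / len(neighborhood)
    change = scores[t] - baseline
    if change > delta:
        return "increased"
    if change < -delta:
        return "decreased"
    return "unchanged"

# Example: a rising AU intensity trace; frame 3 is compared to frames 0-2,
# so a subject-specific neutral face is never required.
trace = [0.10, 0.12, 0.15, 0.40, 0.45]
print(relative_au_label(trace, 3))  # -> increased
```

Because only relative changes within a subject's own recent frames are used, individual offsets such as face shape or wrinkles cancel out of the comparison, which is the robustness the abstract claims.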
CoDi-2: In-Context, Interleaved, and Interactive Any-to-Any Generation
We present CoDi-2, a versatile and interactive Multimodal Large Language
Model (MLLM) that can follow complex multimodal interleaved instructions,
conduct in-context learning (ICL), reason, chat, edit, etc., in an any-to-any
input-output modality paradigm. By aligning modalities with language for both
encoding and generation, CoDi-2 empowers Large Language Models (LLMs) to not
only understand complex modality-interleaved instructions and in-context
examples, but also autoregressively generate grounded and coherent multimodal
outputs in the continuous feature space. To train CoDi-2, we build a
large-scale generation dataset encompassing in-context multimodal instructions
across text, vision, and audio. CoDi-2 demonstrates a wide range of zero-shot
capabilities for multimodal generation, such as in-context learning, reasoning,
and compositionality of any-to-any modality generation through multi-round
interactive conversation. CoDi-2 surpasses previous domain-specific models on
tasks such as subject-driven image generation, vision transformation, and audio
editing. CoDi-2 signifies a substantial breakthrough in developing a
comprehensive multimodal foundation model adept at interpreting in-context
language-vision-audio interleaved instructions and producing multimodal
outputs.
Comment: Project Page: https://codi-2.github.io
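The any-to-any paradigm the abstract describes can be pictured as a three-stage pipeline: modality-specific encoders align each input with the language model's feature space, the MLLM fuses them and emits continuous output features, and modality-specific decoders render those features. The toy sketch below only illustrates that data flow; every class, dimension, and computation here is a hypothetical stand-in, not CoDi-2's architecture.

```python
# Toy data-flow sketch (not CoDi-2's actual code) of the any-to-any paradigm:
# align every input modality with a shared feature space, let a stand-in
# "LLM" fuse the aligned features, and hand the continuous result to a
# decoder for the requested output modality.

DIM = 4  # toy feature dimension, chosen arbitrarily for illustration

def encode(modality, payload):
    """Map any input modality into a shared DIM-d feature vector (toy hash)."""
    h = [0.0] * DIM
    for i, b in enumerate(str(payload).encode()):
        h[i % DIM] += b / 255.0
    return (modality, h)

def llm_step(aligned_inputs):
    """Stand-in for the MLLM: average the aligned features into one output."""
    fused = [0.0] * DIM
    for _, h in aligned_inputs:
        for i, v in enumerate(h):
            fused[i] += v / len(aligned_inputs)
    return fused

def decode(target_modality, feature):
    """Stand-in decoder: render the continuous feature as a target modality."""
    return f"<{target_modality} generated from {[round(v, 2) for v in feature]}>"

# An interleaved multimodal instruction: text + image + audio in, image out.
inputs = [encode("text", "make the dog bark like this clip"),
          encode("image", "dog.png"),
          encode("audio", "bark.wav")]
print(decode("image", llm_step(inputs)))
```

The point of the sketch is that the output modality is chosen at decode time, independently of the input mix, which is what makes the interface "any-to-any."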
Salivary Gland Tumors in Maxillofacial Region: A Retrospective Study of 130 Cases in a Southern Iranian Population
Tumors of the salivary glands are uncommon head and neck neoplasms. We conducted a retrospective study of 392 cases over the last 6 years in Shiraz, southern Iran, to investigate the clinicopathological features of these tumors in an Iranian population. The patients' ages ranged from 8 to 85 years, with a mean of 44.57 ± 14.65 years, and the male-to-female (M : F) ratio was 1.02 : 1. Benign tumors showed a propensity towards females, whereas malignant tumors were more common in males. The ratio of benign tumors to malignancies was 2.19 : 1. Pleomorphic adenoma (PA) was the most common tumor, accounting for 85% of all benign tumors, followed by Warthin's tumor (8.6%). Of the 125 malignancies, adenoid cystic carcinoma (40%), mucoepidermoid carcinoma (24%), and invasive squamous cell carcinoma (16%) were the most common histological types. Most of the salivary gland tumors (75%) originated from the major salivary glands and the remainder (25%) from the minor glands. The parotid gland was the most common site of both benign and malignant tumors. Most of our findings were similar to those in the literature, with some variations. Salivary tumors slightly predominated in males. Adenoid cystic carcinoma and mucoepidermoid carcinoma constituted the most common malignancies.